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Abstract 

We introduce a coarse-grained model of DNA with bases modeled as rigid-body ellipsoids to cap- 
ture their anisotropic stereochemistry. Interaction potentials are all physicochemical and generated 
from all-atom simulation/parameterization with minimal phenomenology. Persistence length, degree 
of stacking, and twist are studied by molecular dynamics simulation as functions of temperature, salt 
concentration, sequence, interaction potential strength, and local position along the chain, for both 
single- and double-stranded DNA where appropriate. The model of DNA shows several phase tran- 
sitions and crossover regimes in addition to dehybridization, including unstacking, untwisting, and 
collapse which affect mechanical properties such as rigidity and persistence length. The model also 
exhibits chirality with a stable right-handed and metastable left-handed helix. 
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I INTRODUCTION 

DNA is likely the most well-studied biomolecule, with structural, energetic, and kinetic characteri- 
zation spanning over 6 decades of research since its isolation as the carrier of genetic information by 
Oswald Avery and co-workersu. Elucidating the varied behaviors of DNA has been primarily an exper- 
imental endeavor, due in large part to the difficulties in capturing the molecule's complex motion and 
function either computationally or theoretically. The computational difficulties arc primarily due to 
the fact that much of the interesting behavior takes place on time-scales 3-6 orders of magnitude longer 
than the longest all-atom simulations of a system comparable to the typical size of a gene (currently 
nanosecondscl) . r-. 

For example, RNA polymerase transcribes DNA-at a rate of about 14ms/nucleotide in Eukaryotescl 
during elongation, with comparable rates in ELjColicl. These rates are further slowed by transcriptional 
pausing to regulate arrest and terminationEBQ by ~seconds per pause. Even 'ffeft" processes such 
as bacteriophage DNA ejection have translocation times greater than 10/Lts/bpEm. Time-scales for 
DNA packaging into the viral capsid arj£-i~ 10ms/bpt3. Nucleosome condensation time-scales at in 
vivo histone concentrations are ~ lOmscil. To address any of these biologically relevant phenomena 
computationally currently requires the introduction of coarse-grained models. The choice of coarse- 
grained DNA model reflects the empirical phenomena the model intends to capture. For example, a 
description of Zinc-finger protein binding to DNA would require an accurate representation of major 
and minor grooves, whereas a description of the sequence-dependence of nanopore translocation would 
require an accurate description of the stereochemistry of bases. As in the above examples, these 
phenomena also exhibit slow Jdnctics compared to time-scales accessible computationally. Binding 
rates for transcription factorsEJ are on the order of 1/ms per promoter at a cellular concentration of 
~ 10 6 /nuclcus, and translocation times for .single-stranded DNA at typical experimental voltages are 
on the order of 100/zs for a lOObp sequencelii 

On the coarsest scale, a piece of double-stranded DNA may be described as a semiflcxiblc polymer 
using the wormlike chain model. On this level, the only parameters characterizing the molecule are 
its total length and its bending rigidity k, which determines the persistence length l p = n/k B T. 
This model is very successful in capturing chain properties on scales larger than £ p , for instance 
the force-extension relationO, but carries no information about the internal structure of the double 
strand. Slightly more refi ned piodels approximate single-stranded DNA as a semiflexible chain of 
sterically repulsive spherestSta, which may carry charge and thus interact with an external field, 
although Coulombic monomer-monomer interactions were neglected in these models. One bead per 
nucleotide representations of DNA have also been used to describe supercoiling and local denaturation 
in plasmidsl!3. Bead-spring models of double-stranded DNA with nonlinear interchain coupling through 
hydrogen bonds have been used to study vibrational energy transport and localization!!!! An extension 
of the bead-spring idea, where base-pairs were represented by a planar collection of 14 harmonically- 
coupled beads—allowed for an investigation of spontaneous helix formation from initial ladder-like 
conformations!!^. More detailed representations have been proposed that describe DNA on the level 
of sphericallyTjSvmmctric base monomers attached to a chain of similar monomers representing the 
backbone E2lE![E3. In these approaches base molecules and repeat units on the backbone are modeled 
as spheres that interact with other bases through phenomenological potentials mimicking van der 
Waals and hydrogen bonds. On this level of resolution, it is for example possible to study thermal 
denaturation of DNA, as well as mechanical properties such as bending rigidity. Zhang and CollinsE2l 
treated base-ribose moieties as a single rigid body with residues at the positions of hydrogen bonding 
heavy atoms. While the rigid bodies were constrained to undergo two-dimensional motion, base-pairing 
through a modified Morse potential allowed for a systematic study of the structural changes during 
thermal denaturation. More recently, Knotts et al.c3 introduced a refined bead-spring model that 
distinguishes between sugar and phosphate groups on the backbone and individual base molecules. 
After parameter adjustment which included introducing a Go-like potentials to bias the system to 
the crystal structure, the model was able to describe several features of DNA physics such as the 
dependence of £ p and duplex stability on ionic concentration, as well as the dynamics of thermally 
activated bubble formation in hybridization. 

For phenomena such as protcin-DNA binding and DNA translocation, the base stereochemistry 
characterizing the DNA sequence is of particular importance. To this end, we introduce a model of 
DNA where phosphate and sugar groups are represented by one coarse-grained (spherical) residue, 
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and bases by a rigid-body ellipsoid. Sterically, bases of DNA more accurately resemble a flat plate 
than a spherical object. Energetically hase stacking interactions, predominantly governed by electron 
correlation (van der Waals) interactionscj, play a significant if not dominant role in the stability of the 
double hclixEj. We adopt a systematic coarse-graining approach, where we parameterize the effective 
interactions through all-atom simulations wherever possible. The interactions describing the stacking of 
bases arc optimized against a fully atomistic representation of the base residues. Interactions describing 
covalent bonds, angles and dihedrals along the DNA strand are obtained from equilibrium simulations 
of a short all-atom strand of DNA. One central difference in the present model from previous models is 
therefore the absence of any structure-based potentials, i.e. we do not use effective Go-like potentials 
to bias the system toward an experimentally determined structure. All behavior in the present model 
is a direct consequence of physicochemical interactions and the systematic parameterization procedure. 
The definition of the model and details of our coarse-graining scheme are described in the subsequent 



Model section and in the Supplementary Material (SM can be found at tittp : //www. physics .ubc . 



ca/~steve/publication/23-suppmat .pdf ). We then investigate the predictions of the model for 
several important properties of DNA, in particular persistence length £ p and radius of gyration R g of 
both double and single stranded DNA. We introduce intuitive and quantitative methods to calculate 
both helical twist and the stacking of bases as order parameters characterizing the degree of native 
DNA structure. The behavior of these quantities is quantified as various internal and environmental 
factors are varied, including sequence, interaction strength, strand length, temperature, and ionic 
concentration. 



II MODEL 




FIG. 1: Our coarse- graining scheme overlaid onto an all atom visualization. Panel (A) shows a section 
of 3 nucleotides: spherical sugar (S), and phosphate (P) residues, together with ellipsoidal bases. Panel 
(B) shows a close up of a base with ellipsoid overlay. The axes indicate the principal axes of the base, 
and are aligned with the principal axes of the ellipsoid, which determine the base's initial orientation. 



A Parameterization of the model 

Fig. [I] depicts a piece of DNA in our coarse-grained model superimposed on the all-atom representation. 
Alternating sugar (S) and phosphate (P) groups are replaced by spherically-symmetric residues. The 
sugar residue is located at the position of the CI' atom in standard PDB notation, which is connected 
to the base by a single covalent bond. The phosphate residue is located at the center of mass of the 
phosphate group (PO4). The bases Adenine, Cytosine, Guanine, and Thymine are represented as 
ellipsoidal objects whose structure and interaction potential are described below. 

An immediate problem in coarse-grained systems is the loss of knowledge regarding the Hamilto- 
nian or energy function governing dynamics in the system. In all-atom classical molecular dynamics 
simulations, this problem is resolved by characterizing quantities such as partial charges and interac- 



tion potentials with empirical parameters. 
ab initio quantum mechanical target datafc 



erived from best fit to combinations of experimental and 
. We wish to adopt a similar methodology with respect to 



the extraction of phcnomcnological parameters here, by characterizing our coarse-grained system in 
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terms of effective parameters derived from all-atom simulations. The potential energy function for our 
system has the following form: 

V = V R E*(B 1 ,B 2 ) + V RB 2(B 1 ,Tes 2 ) + VHB(B 1 ,B 2 ) + VLj(r ss ,r sp ) + Vc(r pp ) 

+ V bond (r) + V ang l e (6) + Vdihedrali^) (1) 

The individual terms in Eq. (|l]) are described below. 



B Base-base van der Waals interactions 

The base-base interaction potentials Vre2(Bi, B2) must account for the shape and relative orientation 
of thc.anisotropic ellipsoids representing the bases. We .adopt the functional form derived by Babadi 
et AO, which is a modification of Gay-Berne potentials E3 to better capture the long-distance conver- 
gence to all-atom force fields. This so-called RE 2 potential has somewhat complicated form and is 



summarized in the Appendix AI. Describing the effective ellipsoids characterizing the bases involves 
several geometrical and energetic parameters. These are the three "half-diameters" for each base 
along principal axes (a x ,a y , a z ), three corresponding "reciprocal well-depths" for each base (e x ,e y , e z ), 
a parameter a c characterizing the interaction range between atoms in the all-atom potential, and the 
Hamaker constant A12 = 47re L jp 2 o' 6 in Lennard- Jones (LJ) units (given by Eq. (||)), where p is the 
number density or reciprocal of the effective volume of each ellipsoid. The above 14 parameters are 
found for the 10 possible base-base interactiops (A-A, A-C, etc..) by best fit between the RE 2 func- 
tional form and the all-atom MM3_force ficldl 2 ^ with Buckingham exponential-6 potentials, following 
the parameterization procedure inE3. The resulting values are used in V RB 2(Bi,B2) in Eq. ([[]), and 
are summarized in Table SI in SM. 

Though the values in Table SI are essential to describe the potential, it is more useful intuitively 
to determine the half-diameters of the bases by taking two identical bases and aligning them along the 
three principal axes. Two identical bases are brought together along a principal axis until their RE 2 
potential is equal to lk B T and 10fc B T. The distances that result are measures of the energetically- 
determined effective diameters of the ellipsoid representing the base, and are tabulated for all residues 
in Tables SII and SIII in SM. The distances determined in this manner from the RE 2 equipotentials 
should correlate with the size of the bases as determined other independent measures such as the 
effective hydrodynamic radii of the bases, which can be extracted from all-atom simulation studies of 
the diffusion of a base in water. This comparison is described below and in the Appendix. 



C Base-Sugar and Base-Phosphate van der Waals interactions 

Bases may interact as well with phosphate or sugar residues according to so-called sphere-asphere 
potentials V R E 2 (^ii res 2), which are a limiting case of RE 2 interactions when one entity is spherical. 
The form of this potential is shown in Eqs. ( |A . 1 2) - [A~13 ) . Computation of sphere-asphere potentials is 



more efficient than the full RE 2 potential. We take RE 2 parameters between the base in question and 
a sphere of the LJ radius a. These parameters are given in Tables SIV and SV for the interactions of 
bases with phosphates and sugars respectively. Basc-phosphatc potentials are derived from fits to the 
MM3 potential in the same manner as base-base interactions, and result in well-depths between about 
0.2 and 0.7k B T, Base-sugar interaction parameters are selected so that their corresponding potentials 
are nearly purely repulsive. This results in small values of the energy parameters, and an interaction 
radius of 2 A (see Tables SIV and SV). 



D Sugar-sugar and sugar-phosphate van der Waals interactions 

Sugars may interact with non-local sugar and phosphate residues that happen to be in spatial proximity. 
We model their interaction with a Lennard- Jones potential 

V L j(r) = 4e LJ [(a/r) 12 - (a/r) 6 ] (2) 

between residues j > i + 2. We use a well depth of e LJ = 0.25 kcal/mol and a distance of a = 2A. This 
potential mainly serves to prevent steric overlap of backbone residues. 
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E Base-base hydrogen bonds 




FIG. 2: Explanation of the hydrogen bond potential Eq. (||). The left image shows two bases aligned 
in the minimum energy configuration, and the right image shows them in a higher energy configuration 
where r ^ tq and 0\ ^ #2 7^ 0. 

Bases in DNA may pair by hydrogen bonding, an interaction not accounted for by the RE 2 po- 
tential, which is based upon atomic van der Waals interactions. We add this in explicitly using a 
phenomenological potential with a form generalized from that used in all-atom studiesE3: 

(cos 4 (30i) cos 4 (30 ?; ) + cos 4 (30,) cos 4 (3^-)) (3) 

The hydrogen bond potential for a Watson-Crick pair can be interpreted as the sum of two potentials 

(i) (j) 

Vfj B + Vjjg, representing the hydrogen bonds from base i — > j and base j — > i respectively. The 

potential can be best understood from Fig. ||B. V bb is a three-body potential between the center of 
mass of base i (r,), a point on the exterior of the ellipsoid considered to be the origin of the hydrogen 
bond (r A ) and a point on the other base (r B ) corresponding to either donor or acceptor. The hydrogen 
bonding in the other direction is similarly defined. We take r A and r B to be the unique points along 
the line segments joining ri and r2 which intersect the surfaces of the two ellipsoids when the_bases 
have the positions determined by the standard coordinates of the B isoform structure of DNAtil. In 
Eq. (|J), r = \r A — r B and r N = |r A — r B | in the B isoform structure (see Fig. |i|B). 

The angle 6 in Eq. (^) is defined in Fig. ^B. The angle <j> in Eq. (J3J) captures the empirical fact that 
multiple hydrogen bonds between bases results in a rigidity to fluctuations that would prevent base 2 
from rotating about p 2 for example. Let ri! and n 2 be the normals of bases i and j respectively. The 
^-dependent terms breaks the symmetry of rotations of base i about pi = r A — ri and rotations of j 
about P2. In Fig. |l|B, these normals are all pointing out of the page for simplicity. For the case of fa we 
project the normal 112 onto the plane perpendicular to pi and <fr will be the angle between this projected 
vector and ni: cos(fa) = ni • "^Z"^p^ ■ We perform the symmetric calculation (i <->■ j, 1 «-> 2) to get 
4>j ■ Since the ^-dependent terms in Eq. (^) already provide rigidity against fluctuations such as those 
shown in Fig. [I]b (right) as well as to buckling out of plane, we found the above formulation superior 
to simply using ni • n 2 which would "overcount" the potential cost of those fluctuations. 

As in all-atom hydrogen bond potentials, raising the cosines in Eq. (^) to the power of 4 ensures 
that the potential is strongly orientation dependent. The factor of 3 inside the cosine is a result of the 
fact that we apply an angle cutoff of 7r/6 to both 9 and (j). Therefore, when either angle reaches 7r/6, its 
contribution to the hydrogen bond potential vanishes. This also ensures a well-depth e H B) and avoids 
multiple minima in the cosine terms. The equilibrium separation r N is chosen so that the minimum of 
the potential matches the correct equilibrium separation in the crystal structure. 

Energy well depths must account for the fact that when a base pair hydrogen bond is broken, the 
two bases can form hydrogen bonds with the solvent. In implicit water, the well depth of the hydrogen 
bond represents the free energy difference between bonded and unbonded base pairs in. solvent, which 
has been determined experimentally to be 1.2 ± 0.4 kcal/mol for each hydrogen bondEl Additionally, 
there is coopcrativity in the energies of three hydrogen bonds, which makes G-C bonds more than 
1.5 times stronger than A-T bonds. The stability of a base pair is also reduced by the presence of 
neighbors in the crystal structure, which induce strain on the base pairE3, and reduces the value from 
that of an isolated base pairEJ, to stabilities of 10.2 kcal/mol and 17.2 kcal/mol for A-T and G-C pairs 
respectively in vacuo. The ratio between these two hydrogen bond well depths is 1.686, which when 
applied to the reasonable estimate of 2.4 kcal/mol for the hydrogen bond strength of A-T pairs in 
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water, gives an estimate of 4.0 kcal/mol for the hydrogen bond strength of G-C pairs. We therefore 
take 2.4 and 4.0 kcal/mol as our well depths e UB for A-T and G-C hydrogen pairings respectively. 

We allow hydrogen bonds between any pair of bases that satisfies the Watson-Crick pairing re- 
quirement. Due to the fact that our hydrogen bond potential is strongly orientation and position 
dependent, we found that it was very rare that a base shared a bond with more than one base at a 
time: on average, about 95% of the hydrogen bond energy was concentrated between the putativcly 
bonded base pairs. The average base-base stacking energy at typical temperatures and ionic concen- 
trations is approximately -1.6 kcal/mol per pair of stacked bases. For the same conditions we find 
an average hydrogen bond energy per hybridized base pair of -3.2 kcal/mol. Because a double helix 
of length N has N base pairs and 2N — 2 stacking interactions, the total hydrogen bond energy and 
stacking are comparable in strength in our model. 



F Phosphate-phosphate electrostatic interactions 

Phosphate atoms in DNA atoms carry a partial charge close to — e in solution; in our coarse-grained 
model we thus assign a charge — e on each P residue. These charged residues interact by a screened 
Coulomb potential in the presence of ionic solution, which is well approximated by Debye-Huckel 
theory. Ions are treated implicitly, and the potential between charged sites is given by 

V qq {r) = ^ (4) 

Here e = 78 is the dielectric constant of water, and £ D = (e £k B T /2e 2 c) 1 / 2 is the Debye length. e Q is the 
permittivity of free space, and c is the ionic concentration. In water with 200mM ionic concentration 
of KC1, the Debye length £ D w 6.8 A. 



G Determination of local potentials by all-atom simulation 




FIG. 3: Recipe for the determination of bond, angle, and dihedral potentials from all-atom simulations 
(see text). 



For all local potentials on the second line in Eq. (|lj), no functional form is assumed a priori. Instead 
the functional form and the potential parameters are extracted from thermal sampling of equilibrium 
configurations in all-atom simulations. For example, bond angle potentials up to an unimportant 
constant are given by V e g(6) = —k B Tlnp(9), where p{9) is the probability to observe a bond angle of 
9. To obtain statistics that are undistorted by the presence of potential terms already accounted for in 
Eq. ([!]), we designed a modified system which had no base-base interactions, and minimal Coulombic 
interactions between phosphates. Three consecutive bases are taken with center base A, T, G, or C as 
in Fig. |3|A. The end bases are then removed and replaced with a hydrogen of correct bond length to 
each CI' atom (Fig. ||B). This removes any bias in the effective potentials that would have been due to 
base-base interactions, which are already included in Eq. (Q). The molecule is then solvated in a box of 
water such that the water extends 8A beyond the boundary of the simulated molecule on all 6 sides, and 
K + and CI" ions are added at an average concentration of 200mM (Fig. |3|C). In practice this involves 
adding of order 1 ions to the system. Charge neutrality biases the number of ions of each type: there 
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are more positive charges present to balance the negative phosphate backbone. The resulting system 
is then simulated for ~ .250ns for each base, using the NAMD simulation packagetJ with CHARMM27 
potential parameter setc3. This long simulation time is required for the statistics gathered to converge 
within reasonable limits. Bond potentials converge to within 1% of their asymptotic value in 0.02ns, 
single well angle potentials converge in 0.5ns, double well angle potentials converge in 2.5ns, and the 
dihedral potentials converge in 5ns. The longer simulation time was to be certain of the convergence, 
and was not prohibitive due to the very small size of the strand we were simulating. Dihedral potentials 
converge slower: their potentials are shallower resulting in a longer relaxation times, and a larger phase 
space (larger number of atoms) is involved. 

We found that the potentials extracted from all-atom simulations often differed markedly from the 
commonly assumed phenomenological forms for such potentials between the same sets of atoms. In 
the supplemental materials section C, we show representative plots of the statistics-derived potentials 
for the case of bonds lengths (Fig. SI), bond angles (Fig. S2), and dihedral angles (Fig. S3). We use 
A,T,G,C to represent the location of the center of mass of each base. The parameters for all these 
potentials arc given in Tables SVII-SX in SM. 

Wc also found that correlations between the coordinates used for Boltzmann inversion were gener- 
ally small but nonzero, with an RMS value for cross-correlations of « 0.19. Larger cross-correlations 
tended to be between overlapping degrees of freedom such as bond Py S and angle Py SPy . We plot 
the correlation matrix in Fig. S4. 

Almost all bonds show Gaussian statistics consistent with harmonic potentials: Vb on d{i") = hk r (r — 
r ) 2 , however the spring constants k r vary considerably. The stiffest bonds were between the center 
of mass of the bases and the CI' sugar residue. For this bond, pyrimidines, being the smaller bases, 
were stiffer than purines. The effective bond spring constant correlates well with that predicted by 



the phonon dispersion relation for a TD chain of coupled oscillators (Fig. AlS| ). The bond potentials 
between base center of mass and CI' are quite stiff, having natural frequencies of order 0.1 fs _ . Ex- 
plicitly including these would necessitate a prohibitively short time-step in coarse-grained simulations. 
For this reason we treat the base-Cl' system as a single entity, i.e. the base ellipsoid plus CI' residue 
are hard-coded as a rigid object with fixed internal geometry. Because the position of the neighboring 
sugar residue is always the same relative to the base, a force on the sugar induces the same force as 
well as a torque on the base ellipsoid. 

There are 11 different bond angles in the coarse-grained model. Five of these have harmonic 
potentials, V ang i e (9) = ^k$(6 — o ) 2 , with stiffness coefficients kg ranging from about 3 to 19kcal/mol- 
rad 2 . The stiffness of S 5 ,P 5 ,S and SP 3 ,S 3 , are equal within error bars, and might differ only through 
end effects, hence wc take the average. Six of the bond angles were much more readily fit by double 
well potentials: 

V ang i e (9) = -fc B Tln ( e -M0-<M 2 /2fc B T + Ae -k 2 (B-e 3 )' /2k B T\ (5) 

with minimum angles 0\, #2, effective stiffness constants k%, fe, and dimensionless weighting factor A. 
While typical barrier heights for interconversion were about lk B T, the largest barrier for interconversion 
between wells was about 4fc B T, for 6tsp 3 ,- Similar to the bond potentials, the stiffness coefficients 
of the harmonic bond angles show a decreasing trend with mean distance between the participating 



atoms (Fig. A18 ) 



The minima of the parameterized angle potentials compare well with the angles extracted from the. 
atomic coordinates of the standard model of (B,10i,0.338)-DNA derived from the crystal structurcEil 
(Fig. S5) . In cases with double-welled angle potentials, the B-form standard model angles lie between 
the two parameterized wells, closer to the deeper one. Therefore, the parameterization taken from a 
piece of DNA too small to form anything like a double helix actually biases the angle potentials to 
a double helix quite well. An exception was backbone angles, which showed a significant discrepancy 
with those of the B-DNA structure. Other potentials such as stacking may strongly influence the 
equilibrium angle observed for larger strands of DNA, as noted by Tepper and VothtLa, who set their 
equilibrium backbone angle to n. 

There are 4 types of dihedral potential in the model: using the notation X for any of the four bases, 
these are SyPySX, SV Py SPy , PySPySy, and XSPySy (these coordinates can be seen in Fig. |l|A). 
The dihedral potentials for different bases generally have different parameters, which splits the above 
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4 types into 10 distinct dihedral potentials. The dihedral potentials are well-fit by the functional form 

3 

V d ihedral(<t>) = ^ #n(l - COs(?10 - </>„)). (6) 
n=l 

Barriers between local minima on the dihedral potential were all less than about l.5k B T, indicating 
rotations with respect to these degrees of freedom are all quite facile. A comparison of the minima of 
the resulting dihedral angles with those extracted from the crystal structure of B-DNA can be seen in 
Fig. S6. 

There is an improper angle between the normals to the planes defined by PySX and P^SX, where 
X represents the base. This angle is taken to be zero when the Py residue coincides with P51, and 
7r when P31SXP5' all lie in the same plane in the shape of a "Y". Generally the parameters entering 
into this potential were base-dependent (see Table SX), and well fit by Eq. There was generally 
a preferred angle around 0.87T for the purines and ~ 0.5-7T for the pyrimidines, and a large barrier 
(~ 5fc B T for purines and ~ 8k B T for the pyrimidines) that inhibited full rotation. 

We found that in the all-atom simulations bases were not free to rotate: interactions with the rest 
of the molecule biased the orientation of the effective ellipsoid to have a preferred angle. However these 
interactions are already counted in base-P or P-P interactions in Eq. ([!]). so to explicitly include such 
effects in an angle potential representing the orientation of the base would be redundant. Moreover we 
observed that explicitly adding such a base-orientation potential derived from the all-atom statistics for 
3 consecutive bases decorrelates the normal vectors of each base from its neighbors and thus competes 
with stacking order. We thus allowed bases to freely rotate about the Cl'-base bond. 



H Langevin thermostat 

We have adapted the molecular dynamics package LAMMPsEl to simulate the coarse grained DNA 
model. The equations of motion are integrated using a conventional Velocity Verlet method for trans- 
lational and rotational degrees of freedom with a timestep At = 2 — lOfs depending on the degree of 
collapse in the system. To avoid the well-known singularities associated with Eulcr angles, a quaternion 
representation is used to describe the orientation of the ellipsoidsEJ. Since we represent the DNA in 
implicit solvent, we employ a Langevin thermostat to maintain constant temperature. The thermostat 
adds forces F^ L ^ and torques t^ l > to the deterministic forces and torques arising from the interaction 
potentials. In the principal (body-centered) basis, these stochastic forces and torques are given by 

= -tfvi+it (7a) 

rf> = -crv+er*. (7b) 

Here and uii denote the Cartesian components of translational and angular velocities, (f r and Q ot 
are the eigenvalues coefficients of the translational and rotational friction tensors and £' r and £™* 
are components of white noise whose amplitudes are related to the friction coefficients through the 
fluctuation dissipation theorem. 



In the body frame of the ellipsoid, the friction tensor £ ro * is diagonal so that Eq. (7b) for the torques 
can be evaluated directly. The Langevin forces on the translational degrees of freedom in the fixed 
lab frame, however, depend on the orientation of the ellipsoid relative to the direction of motion. On 
timescalcs shorter than a typical rotation time, the diffusion of an ellipsoid is anisotropic, but crosses 
over to an effective isotropic diffusion at longer timesc2. We found it easiest in calculating these forces 
to first project the velocity vector into the body frame of the ellipsoid (where the friction tensor is 
diagonal), evaluate Eq. ( |7a| ) in the body frame, and lastly rotate the resulting force vector F lan9 back 
into the frame of the simulation cell. The same basis transformation was performed in calculating 
torques in Eq. (ffb|). 

The values of the translational and rotational friction coefficients in Eqs. dTa-uq) are taken directly 
from all-atom simulations of the diffusion of a single base solvated in a (20Ay box of water with 
200 mMol KC1. Separate 10 ns simulations are performed for each base. From these results, the 
effective radii of the ellipsoid representing the base can be found by comparing the observed friction 
coefficients with that predicted from continuum hydrodynamic theory It is encouraging that these 



radii are comparable to the energetic equipotential radii at 10fc B T (Fig. A17 ). However it is worth 



Coarse-graining DNA 



9 



noting that the radii extracted from continuum hydrodynamics tend to be smaller than the energetic 
radii, most likely due to a breakdown of the no-slip condition at the interface between bases and water. 

Energy scale for the effective simulation temperature. We express temperature in units 
of a natural energy scale e in the system. We construct this energy scale by taking the minima of the 
RE 2 potential between identical bases given as V m i n in Table SII, and averaging this value over all 
bases A, C, G, T, and orientations x, y, z (an average of the 12 energetic values in Table SII). This 
results in a value of e = 1.45 kcal/mol. 

Sequence definitions. Sequences used in our simulations will be referred to with the following 
convention. We will use C^-Gn to mean a strand of poly(C) of length N which is hydrogen bonded 
to its complementary poly(G) strand of the same length. Ajv-Tjv is similarly defined. We denote by 
HET several heterogeneous sequences, defined in the 5' — ► 3' direction as: 

HET SS ,25 = C AGGATT AATGGCGCCT ACCTT ACC 

HETss^ = C ATCCTCGAC AATCGGAACC AGGAAGGGCC 

HETss,go = HETss,3o - CCGCAACTCTGCCGCGATCGGTGTTCGCCT . 

Finally, HETos,n is a strand of HETss.n that is hydrogen bonded to its complementary strand. 



Ill RESULTS 

A Persistence length of ssDNA and dsDNA 

The persistence length £ p is a measure of the rigidity of a polymer, and is given by the decay constant 
of the backbone unit tangent vector t(s) as a function of base index or position s along the strand: 
(t(s ) -t(s)) = e~ s l^, where (• ■ •} is a thermal average. For ssDNA, the tangent t was calculated by 
taking the vector from the sugar (CI') residue on base i to the sugar on base i + 1, and normalizing to 
unity. For dsDNA, the tangent t was calculated by taking the vector from the midpoint of the sugar 
residues of hydrogen bonded base-pairs at i, to the midpoint of sugar residues of base-pairs at i + 5, 
and then normalizing to unity. The persistence length £ p was then obtained by exponential fits of the 
data to the above correlation function. We found that for dsDNA, if the local principal axis tangent to 
the contour length of DNA (see section 3B) is used instead of the above recipe, the same persistence 
length is obtained to within 2%. 

Without the stabilizing structure of the double helix, single stranded DNA at ionic concentrations 
of 0.04 M has a persistence length on the order 1 nm, which corresponds to 2-3 bases (the distance 
between successive base pairs is approximately 0.4). £ p is chiefly governed by the repulsive Coulomb 
interaction, which tends to straighten out the strand to maximize the distance between phosphate 
residues. As the interaction becomes more screened by the presence of ions in solution, £ p will tend to 
decrease, as shown in Fig. ^. The functional dependence on concentration is captured by a constant 
bare persistence length added to a term inversely proportional to concentration c3. 

The values for l p found in our simulations are generally less than those observed experimentally, 
but it is worth noting that the experimental measurements themselves are highly variable. This can 
be understood from the persistence length being highly sensitive to other factors such as base se- 
quencer and-the experimcntaLsct-up used to measure the persistence length, for example fluorescence 
spectroscopy!^, laser tweezersc3, hairpin loopsci and gel electrophoresis^. 

Fig. |I]A also shows without the RE 2 potential the stiffness of the single-stranded DNA increases. 
This occurs because the RE 2 potential is an attractive interaction, which tends to collapse the strand. 
At the temperatures in our simulation for Fig. |]A (310A"), the stacking and hydrogen bonds are the 
same order as k B T. This temperature is above the dehybridization temperature in our model, and also 
above the unstacking temperature for ssDNA described in more detail below. The consequence of this 
here is that base-base interactions tend to be non-local, involving non-consecutive bases in sequence, 
as can be seen from the snapshot in Fig. |^A. Thus, removing base-base interactions (by setting the 
first term in cq. (Q) to zero) tends to stiffen the strand. On the other hand, at lower temperatures the 
opposite behavior is observed: stiffness does increase with increasing stacking interactions. We study 
this effect in detail below. 

The double helix is inherently more stable to thermal fluctuations than a single strand, as two 
backbones wound around each other provide larger elastic modulus. As can be seen from Fig. ^B, the 
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FIG. 4: (A) Persistence length vs ionic concentration of a random 25bp ssDNA sequence {HETss,2b) a t 
a temperature of 0.42e simulated for 300 ns. Also shown are simulation results for ssDNA with the RE 2 
potential turned off (blue triangles). The remaining data sets are from various experiments: Murphy et 
a/H, Smith et a0, Kuznetsov et acll and Tinland et a£l. (B) Persistence length vs. ionic concentration 
of a 60bp dsDNA sequence (HETpsfio) at a temperature of O.lOe simulated for 600 ns. Results for the 
RE 2 potential scaled to 1/4 strengtbare also shown (blue Xs). Dashed lines in both panels show the 
theoretical model of Nguyen et. al.E3 for the persistence length of a polyelectrolyte, whose functional 
form consists of an intrinsic persistence length plus a term inversely proportional to ionic concentration. 
Insets show representative snapshots taken from the simulations (rendering with BioVECeJ). 

persistence length of the double strand has the same functional dependence on ionic concentration as 
ssDNA, but is roughly 55 times stiffer at 0.02M, 25 times stiffar at 0.04A/ and 52 times stiffer at 0.13M. 
The stiffness ratio from experimental measurements is w 66c2l. Weakening the RE 2 potential has little 
effect on the persistence length. That is, due to the extra stability provided by double-stranded 
hybridization, the double helix shows no collapse on the scale of ~ lOObp (see inset snapshots), so that 
weakening base-base interactions only modestly reduces the stiffness due to stacking. The effect can 
also be seen more evidently from the radius of gyration (Fig. S7). 

The present model predicts a larger persistence length for a homogeneous single-strand of adenine 
(as large as ~ 50 bases or more) than the corresponding homogeneous strand of thymine bases (£ p w 2 
bases) at low temperatures, as seen in Fig. ||(A). These results are consistent with the conclusions of 
Goddard et aLcB, who found larger enthalpic costs for hairpin formation in poly (A) than in poly(T). 
However, at high temperatures the situation is reversed, and ss-poly(A) has a smaller £ P (sa 1.5 bases) 
than ss-poly(T) (l p rs 4 bases), see Fig. ||(B). Adenine, being a purine, has a stronger RE 2 stacking 
interaction (see the z-minima in Table SII), however all A — A, A~P interactions are generally stronger 
than T — T, T — P interactions, and at high temperature this induces a greater degree of collapse due 
to non-local self interactions of DNA strand. The persistence length shows an increasing trend with 
the length of the strand, at the temperatures and ionic concentrations that we studied (Fig. || A and 
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FIG. 5: Persistence length of ssDNA for poly(A) and poly(T) as a function of the number of bases in 
the polymer chain. Simulations were conducted at temperatures of (A) 0.03e and (B) 0.42e at ionic 
concentration of 40 mM for 240 ns. At these temperatures and ionic concentrations the distance between 
stacked bases is about 4A. Note that the temperature in (B) is above the hybridization temperature for 
dsDNA. 

B). This is due to the exaggeration of end effects on shorter strands. The persistence length converged 
to its infinite length value for stands longer than about 7£ p at high temperature. 
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FIG. 6: Persistence length and radius of gyration vs. temperature for ssDNA sequence HET$s,25 at 
ionic concentration 0.04 mol/L. Simulations were run for 240 ns. 
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At moderate to high temperatures, we observe a heterogeneous strand to collapse as we lower the 
temperature, as seen in Fig. ||. We used radius of gyration R g to monitor the overall degree of collapse 
of the DNA, which is given by 

1 N 

fc=i 

where r avg = N" 1 J2k=i r k - an d the angle brackets represent the thermal average. Small values of R g 
correspond to collapsed states. The general increase of persistence length and radius of gyration in the 
model with temperature contrasts with the temperature dependence of a worm- like chain (£ P ~ T _1 ). 
At higher temperatures thermodynamic states with larger entropy have larger weight, the polymer 
expands and the self-interactions which reduced the persistence length become less important. The 
collapse temperature where the radius of gyration suddenly increases is « 0.4e (see Fig. [jj). A similar 
trend towards collapsed states with increasing ionic concentration can be seen in Fig. S7. 
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FIG. 7: Radius of gyration of ssDNA for poly(A) and poly(T) as a function of the number of bases 
obtained from the same set of simulations shown in Fig. || (i.e. c = 40mM (A) T = 0.03e, (B) T = 0.42e). 
R g exhibits a power law scaling with the number of bases with exponents of (A) 0.54 ± 0.08 for poly(T), 
0.84 ± 0.01 for poly(A), and (B) 0.80 ± 0.02 for poly(T) and 0.45 ± 0.07 for poly(A), respectively. Insets 
show representative snapshots taken from the simulations^! 

The temperature dependence of persistence length and radius of gyration is not a simple monotonic 
function. Its complexity is seen by comparing Figs. || and 0. The RE 2 interaction in the adenine bases 
is strong enough to cause a well stacked configuration at low temperatures, whereas for thymine bases, 
the stacking interaction is too weak (Fig. |t]A). At higher temperatures, Fig. [?|b shows that the stronger 
RE 2 interaction causes collapse, a fact confirmed by Fig. |^A. 

A slightly stronger or weaker van der Waals potential between bases results in DNA that is either 
collapsed or expanded respectively. Our parameterized DNA is poised between an expanded and col- 



Coarse-graining DNA 



13 



lapsed state (see Fig. A19| ), so the actual state of DNA would be highly sensitive to conditions affecting 
base-base interactions. In vivo mechanisms for modulating base-base interactions include DNA methy- 
lation, or potentially nucleosome post-translational modifications utilized for gene regulation such as 
histonc phosphorylation, acetylation, or methylationlfj. 



B Twist and Stacking of dsDNA 




FIG. 8: (A) Calculation of local helical axis and twist (see text). (B) Calculation of the degree of stacking, 
or stacking fraction. Principal axes are found for each ellipse and a plane normal to the z axis is taken 
through the center of the base as shown in the bottom figure. This defines an ellipse for each base. The 
stacking is defined by projecting ellipse i + 1 onto ellipse i and vice versa, and measuring the overlap (see 
text). The top of (B) shows a visualization of this projection. 

The twist is defined as the average angle that the backbone of the double helix rotates about the 
helical axis for each successive base pair. Fig. ^A visualizes the calculation of the local twist at position 
i along the DNA. To obtain the helical axis at a given position along the double helix, we take the 
positions of the two sugars opposite the hydrogen-bonded bases at that position, as well as the sugar 
pairs up to three bases above and below that position. From the sugar coordinates, we compute the 
principal moments of inertia, and take the moment of least rotational inertia to be the helical axis. 
This method fails if the double helix persistence length drops to be on the order of three base pairs or 
if the helix dchybridizes. However, neither of these scenarios occur for simulation parameters giving a 
stable double-helix. The sugar-sugar vector rotates around the helical axis as one proceeds along the 
bases; the angle between the ith and i + 1th sugar-sugar vectors is the local twist. This quantity is then 
averaged along the strand as well as over time. Snapshots during the simulation give the average twist 
reported in the figures below. We can similarly define the pitch of the DNA as pitch = 2-7r/twist, i.e. 
the number of base pairs one must traverse for a fulLrcvolution of the helix. Note that the observed 
crystal structure value for the pitch of B-DNA is ICO, giving an expected twist of 36°. 

We also develop an order parameter to define how well a given base is stacked to its neighbors. 
Taking the dot product of z principal axes of the ellipsoids has translational symmetry in the x — y 
plane of cither ellipsoid, and so does not capture the concept of stacking. The method we employ 
instead uses area projections, depicted in Fig. ||B. For each ellipsoid, we take its cross-section of the 
ellipsoid in the x — y plane of its own principal axes, which is an ellipse. To calculate the stacking 
fraction between bases i and i + 1, we project cross-section i + 1 down onto cross-section i. We then 
take the average of this value with the equivalent projection of i onto i + 1 in order that our definition 
of stacking be symmetric. We divide the projected area by the area that would be obtained from the 
B-DNA crystal structure (a stacking fraction of 0.6) to properly normalize. Because of twist, bases are 
not perfectly stacked in the crystal structure, so it is possible to see stacking fractions greater than 1. 
This definition of stacking mathematically represents the intuitive notion of stacking very well, namely 
the degree to which the flat parts of the two objects overlap their areas. The stacking fraction is a 
geometrical order parameter, but correlates strongly with the RE 2 energies between neighboring bases 
on either of the two strands of the DNA (see Fig. S8) . Thus stacking fraction accurately captures the 
base-base van der Waals stacking energies in addition to quantifying the structural features of DNA. 

Wc found base stacking to be very heterogeneous, with traces of periodicity along the strand having 
a period of roughly four to five bases (see Fig. ||). Bases seem to stack well in small groups, at the 
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FIG. 9: Time averaged stacking fractions of homogeneous strands in a helix with their complementary 
strand. Figs. A and B are obtained from simulations of 60 base dsDNA, consisting of a homogeneous 
strand of poly (A) hydrogen bonded to a similar strand of poly(T). The figures show the stacking of the 
adenine bases with themselves and the thymine bases with themselves respectively. Figs. C and D are 
obtained similarly to A and B except with guanine and cytosine bases. All simulations were conducted 
at T = 0.07e and an ionic concentration of 0.04 mol/L. The total simulation time is 800ns. The average 
stacking fraction for each of these simulations is 0.1889 ± 0.0185 for adenine, 0.3178 ± 0.0343 for thymine, 
0.3065 ± 0.0099 for guanine and 0.1593 ± 0.0271 for cytosine. 
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FIG. 10: Time averaged stacking fraction of the bases along a strand for both the all-atom parameterized 
potentials and another simulation with weakened parameters. Here the bond potentials were weakened by 
a factor of 0.5, the angle potentials by 0.1 and the dihedral potentials were disabled. This indicates that 
the decimated stacking pattern we observe for bases is due to frustration between stacking interactions and 
the other potentials in the model (bond, angle, dihedral). These simulations were done with HETds,30 
at an ionic concentration of 0.04 mol/L and a temperature of 0.07e. 

expense of poorer stacking in bases very nearby along the strand. This results in kinks in the stacking 
structure of the strand, with the distance between kinks being only a few base pairs. An interesting 
trend is that the purines, the larger bases with greater stacking interactions, fluctuate far less and 
show more consistent stacking. On the other hand pyrimidines show larger extremes in their decimated 
stacking pattern, stacking intermittently more strongly than purines, and completely unstacking. The 
stronger stacking interactions of the purines apparently induce convergence to the average value. The 
decimation pattern that we observe for the bases is due to frustration between stacking interactions 
and the other potentials (bond, angle, dihedral), and reducing these other potentials resulted in less 
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fluctuation, and a larger average stacking (see Fig. [h]). This situation is reminiscent of a Frenkel- 
Kontorova model where a competition between two incommensurate length scales corresponding to 
the equilibrium separation of a ID chain .of oscillators and a periodic underlying potential results in 
frustration-induced domains of oscillatorscll. 

To understand which parts of the Hamiltonian control the large scale properties of twist and stacking 
of dsDNA. we applied global multiplicative scaling factors to the energies of individual potential classes, 
such as RE 2 , bond, angles, dihedrals and hydrogen bonds. Our simulations show that the potential 
most sensitively affecting the structure of the double helix is the base-base RE 2 interaction. Increasing 
the base-base RE 2 potential reduces the contour length of the helix, (see Fig. S9), increases the stacking 
fraction, and tightens the twist over a large range of interaction strength. 



A 



(U 
CD 

33 

-t— I 
CO 



c 
o 



o 

CO 



c 
o 
CO 



40 
35 
30 
25 
20 
15 
10 
5 



I I 

+ 

+ 

+ 


I 

+ 


1 

+ 


1 1 

+ 4 


+ 






+ 


+ 

+ 




+ 












I I 


1 


1 


1 1 




Multiplicative RE factor 



FIG. 11: Effect of scaling the RE 2 energies on (A) twist and (B) stacking of double stranded DNA for 
a random sequence (HETds,3o) at zero temperature and 40 mM ionic concentration. Simulations were 
started from the standard model (B,10i,0.338)-DNA structured and allowed to equilibrate by energy 
minimization to structures having the values shown at T = 0. The equilibration time was about 80ns, 
implying shallow energetic gradients of collective modes. Insets show representative snapshots taken from 
the simulations^. 

Perhaps surprisingly, we found that the base-base attraction also appears to induce twist of the 
double helix. A picture that has emerged from computational models of DNA structures is that 
twist results from the competition between electrostatic repulsion of phosphate groups and favorable 
base stacking interactions, with stronger stacking interactions favoring alignment of the bases and thus 
putativcly tending to straighten the helix. Indeed, the van der Waals-like RE 2 potential is minimized 
when the relative twist between stacked bases is zero. Fig. [Ii|A plots the helical twist as a function 
of RE 2 interaction strength which modulates stacking interactions. This clearly shows an increase in 
twist with stacking interaction strength over a large range of the RE 2 interaction. These results were 
obtained by a zero temperature energy minimization, starting from the expected crystal structure 
of B-DNA. Finite temperature reduces the overall values but does not change the trend. During 
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the simulations, the helical twist relaxed to a degree determined by the overall strength of the RE 2 
potentials. At the all-atom parameterized values the twist was 21.2°, whereas in the crystal structure 
it is 36.0°. Despite this quantitative discrepancy we did find that the potential function (Eq. (Q)) 
reproduced a double-stranded structure with major and minor grooves (Fig. |l2| ). The ratio of the sizes 
of the grooves minor to major in the coarse-grained model was 0.64, as compared to the experimental 
number of 0.54. Major and minor grooves persist so long as the twist of the DNA is > 10°. 
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FIG. 12: The size of the major and minor grooves as a function of position along the double helix. This 



is a finite temperature simulation of ionic concentration 0.04 mol/L as in Fig. 15. The errors are of the 



order of the symbol size. The double helix makes about two turns over the length of the strand. 

The ladder conformation is preferred when the RE 2 potential is scaled down to zero, where Coulom- 
bic interactions are competing solely with local bond, angle, and dihedral potentials. The twisted 
conf orma tions of the helix bring the phosphate residues closer together rather than farther apart (see 
Fig. A20|) . The direct compression of a ladder configuration would also increase the Coulomb energy 



in inverse proportion to the contour length, as well as frustrating local potentials such as angular 
potentials (see Fig. fl3| and description below). To avoid this energetic cost as DNA is compressed, 
the system can lower its energy by structurally relaxing into a helical conformation. Upon helix for- 
mation, the potential energy of terms such as angles and coulomb lower more than the RE 2 potential 
energy raises due to shearing the stacking pattern. Put another way, the stacking interactions do not 
favor the twisted conformation, they favor proximity of the bases: the total RE 2 energy in a helical 
B-DNA conformation of HETds,30 is about —205 kcal/mol, while if the twist is set to zero from this 
conformation by forcing a ladder initial condition, the RE 2 energy is —270 kcal/mol before relaxation. 
However, the other potentials favor helix formation more so than maintaining the ladder, and so break 
translational symmetry along the DNA contour when forced into proximity by the RE 2 potential. 
The correlation between RE 2 energy and stacking geometry (see Fig. S8) along with the competition 
between potentials in the system together imply that as the strength of the RE 2 interaction continues 
to increase, the system must eventually favor ladder-like configurations as bases are more properly 
stacked. This can be seen in figures [ll]A,B in the range of multiplicative RE 2 factor from 5 to 10. The 
twist begins to decrease (albeit with large scatter in the equilibration data) as the stacking fraction 
continues to monotonically increase above values present in the crystal structure. 

The dominant interaction governing the stacking fraction is the also strength of the base-base RE 2 
potential. There was significant relaxation of the B-DNA model structure at the all-atom parameteri- 
zation values. The twist and stacking fraction obtain their B-DNA model values only when base-base 
interactions are magnified by a factor of 5, as can be seen from Fig. [Ti]b. This may indicate that 
cooperative many-body effects beyond the superposition of pairwise Lennard-Jones interactions are 
governing the stacking interactions between bases. It is worth noting that large dynamical fluctuations 
in the DNA structure are also observed in all-atom simulations!^. 

Increasing the stacking energy (via the RE 2 potential) increases the twist, and this effect frustrates 
nearly all other interaction energies in the system. Fig. [L3| plots the other 5 contributions to the energy 
as a function of the induced helical twist due to increasing the RE 2 energy. The values of the twist were 
taken from the monotonic relationship in Fig. [Tl[A. We found RE 2 energy to be the optimal parameter 
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FIG. 13: Behavior of energy scales as functions of helical twist, as induced by increasing stacking inter- 



actions. Data plotted here are taken from the same simulations as in Fig. 11. Energies have been shifted 



to start from zero to depict the relative effects of increasing twist more clearly. 

to vary the helicity in the subsequent simulations. The increased twist of the double helix frustrates 
the Coulomb and angle potentials most significantly. As the twist of the helix increases, the energy 
of the angle potentials increases, indicating that the all-atom parameterized angle potentials favor a 
ladder configuration. Thus the base-base stacking interactions, and not the angle potentials, govern 
the tendency to helix formation, a result in agreement with previous coarse-grained studieslI3. 



C Increasing temperature reduces twist and stacking in dsDNA in a sequence- 
dependent fashion 

Fig. |l4|A shows the twist as a function of temperature. The sequence Gsq-Cqo maintains a twist of 
approximately 29° until T = 0.05e at which point it drops sharply in a manner suggestive of a phase 
transition, accompanied by large fluctuations manifested by a sudden increase in the statistical error 
at T = 0.05e. The helicity is approximately zero for higher temperatures. The stacking fraction for 
Geo-Ceo shows similar behavior than the twist (Fig. [T^B), with phase transition around T = 0.05e. 
This transition involves a reduction of order within the double-stranded structure, and is distinct 
from dchybridization (the HETos,60 double strand was observed to dehybridize at a temperature 
around 0.31e, and took about 330 ns to dehybridize completely). Above the "untwisting" transition 
temperature, the stacking is still appreciable, although the twist does not appear to be significant. 
We found that Aqq-Tqq maintains very little twist in our simulations. At zero temperature it has a 
weak twist of 12°. This sequence likewise showed the least stacking order among the sequences we 
studied. The sequence HET^sfio has much smoother behavior with changing temperature. At zero 
temperature, it has a twist of 26°, slightly less than homogeneous G-C DNA, and intermediate stacking 
fraction to G-C and A-T. The phase transition behavior present in Geo-Ceo is smoothed out for the 
heterogeneous sequence, yet it curiously maintains helicity until a higher temperature. Stacking is 
intermediate to G-C and A-T sequences at all temperatures, and also has a broadened transition. 



D Coulomb interactions oppose both twist and stacking in dsDNA 



Fig. 15 shows the twist and stacking of the HETosfio sequence as a function of ionic concentration. 
Increasing ionic concentration reduces the repulsive Coulomb energy by decreasing the Debye length 
in Eq. (0). As phosphate backbone charges are more effectively screened in stronger ionic solution, 
the RE potential starts to dominate over the Coulomb potential, which reduces the contour length. 
To minimize frustration of the other potentials upon this compression, the helix twists to compensate. 
Thus an increase in ionic concentration increases the helical twist in an indirect way (see Fig. |J|A). 
The effect saturates as the ionic concentration is increased, whereupon other potentials such as angle 
and dihedral eventually dominate over the Coulomb potential. 

Fig. [i5|b shows that the stacking fraction increases with ionic concentration. This can be understood 
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FIG. 14: Behavior of twist and stacking versus temperature. This is a simulation of sequences Ggrj- 
CgOi Ago-Tee and HETpsfio simulated at an ionic concentration of 0.04 mol/L for 800 ns. Insets show 
representative snapshots taken from the simulations!^. 

from the interpretation of Fig. where it is seen that the Coulomb energy opposes stacking, so that 
ameliorating it by increasing ionic concentration would increase stacking propensity. As ionic strength 
is increased, the DNA strand compresses due to the weakening of Coulomb repulsion. Thermal motion 
is less effective at eliminating stacking in this more compressed state, due to the steric constraints 
manifested in the RE 2 potential. Finite and zero temperature simulations in Fig. [l5] obey the same 
trend. 

E The model shows chiral preference for the right-handed helix 

To address the question of whether the present physicochemical based model exhibits chirality, we 
performed finite temperature simulations of HETds ^q DNA starting from two different initial con- 
ditions: one from the (right-handed) B-DNA standard structure, and another from a structure with 
the azimuthal angle between successive base pairs reversed from +36° to —36°. The latter gives an 
" anti-B-form" DNA with left-handed helix as the initial condition. Each configuration was observed to 
relax to a metastable equilibrium conformation for the duration of the simulation, analogous to either 
B-form or Z-form DNA. 

The difference of the thermal average potential energy (Eq. ([!])) between the right-handed and left- 
handed forms of DNA is plotted in Fig. |l6| as a function of the strength of the dihedral potential. The 
strength of the dihedral potentials are all simultaneously varied by adjusting an overall multiplicative 
factor. We chose to vary the dihedral potential because it is an obvious chiral term in the model's 
potential function. Specifically, the following dihedral potentials show bias toward the right-handed 
(B-DNA) helix (with the strength of the bias given in brackets in units of kcal/mol): S5<P 5 /SP3< (0.25), 
P 5 /SP 3 'S 3 / (0.06), S 5 >P 5 ,SA (0.40), S5/P5/SC (0.18), S 5 T 5 ,SG (0.62). The potential S 5 ,P 5 >ST biases 
toward a left-handed (Z-DNA) helix, but at a strength of only 0.03 kcal/mol. The remaining potentials 
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FIG. 15: Twist and stacking of double stranded DNA vs ionic concentration. Thermal averages at finite 
T are taken from a 600ns simulation of HET^sfiO at T = O.lOe. Zero temperature data is taken from an 
80ns simulation of HETpsfio, and data from the equilibrated state is shown. 

are chiral symmetric. 

The model shows an energetic bias towards the right-handed B-form of DNA, which at the temper- 
ature of the simulation was about 0.67 kcal/mol-base pair, for values of the dihedral potential obtained 
from all-atom parameterization. Interestingly, the chiral bias towards right-handed DNA is maximal 
at this value of dihedral strength. Fig. |l6| also shows a chiral preference even in the absence of dihedral 
potentials. This is evidence that the origin of handedness in the model results from a coupling between 
potentials that individually are achiral, but collectively these potentials result in chiral constituents 
that when coupled together (e.g. by stacking interactions) yield a preference for the right-handed 
helix. We elaborate on this further in the discussion section, but note for now that the structure of the 
Phosphate-Sugar-Base moieties which constitute the building blocks of DNA are themselves achiral 
objects by virtue of their absence of a center of inversion or mirror plane. 

Experimental measurements of the preference of B-DNA over Z-DNA give a free energetic difference 
of about 0.33 kcal/mol-bpE9. The above energetic difference of 0.67 kcal/mol-bp in the computational 
model does not account for entropic differences between the B and Z forms. If one takes the magnitude 
of the value in the model seriously it would imply that the Z form has larger entropy than the B form. 



As is evident from the snapshots in the inset of Fig. 16, the stiffness of Z-DNA was seen to be 
softer than that of the B-form, with persistence length £ P reduced by about a factor of 2 at the all- 
atom parameterized values. Electron microscopy measuremente_^of chain flexibility in Z-DNA have 
shown moderate increases of persistence length l P of about 30%Eil, however these measurements were 
for poly(dG-dC) that had been adsorbed onto a 2-D surface which could introduce new interactions 
modifying l P . Since a larger £ P must correspond to stronger stabilizing interactions whose effect 
would be to increase the bending modulus, one would expect from energetic arguments that the 
thcrmodynamically less stable Z-form of DNA would have a shorter persistence length, as we observed 
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in our simulations. On the other hand the increased number of base pairs per turn in the crystal 
structures implies a larger stacking fraction in Z-DNA, however only by about 10%. Moreover the rise 
per base pair is almost 50% larger in Z-DNA, implying reduced base-stacking interactions which would 
also reduce the stiffness. Reconciling the computational and experimental observations of persistence 
length of Z-DNA is a topic for future study. 




0.5 1 1.5 2 2.5 3 
Dihedral Multiplicative Factor 



FIG. 16: Difference in the thermal average potential energy between right-handed and left-handed forms 
of ds-DNA (sequence HETds,3o), as a function of dihedral strength. Thermal averages are taken from 
160ns simulations at T = 0.03e. Inset images show representative snapshots from the simulations at 
dihedral multiplicative factor unityej. 



IV CONCLUSIONS AND DISCUSSION 

In this paper, we have introduced a coarse-grained model of DNA, using rigid-body ellipsoids to model 
the stereochemistry of bases. This model captures the steric effects of base-stacking, the stability 
of base-pairing hydrogen bonds, screened Coulombic repulsion of the phosphate backbone, nonlocal 
interactions between base-base and base-backbone, as well as backbone rigidity due to bond, angle 
and dihedral potentials. Local effective potentials along the backbone are obtained from the statistics 
of all-atom simulations in explicit solvent. Base-base and base-backbone interactions are obtained 
from best fit between van dfir Waals interactions in an all-atom model and an anisotropic potential 
between effective cllipsoidstH. Hydrogen bonds are modeled by adapting a functional form used in 
all-atom simulations to ellipsoidal bases, and phosphate-phosphate interactions are modeled through 
a mean-field screened Coulomb potential with Dcbyc length dependent on ionic concentration. 

The radii along the principal axes of the bases as defined by cquipotcntials of the anisotropic energy 
function correlate well with the base radii as determined from hydrodynamic diffusion measurements 
in all-atom simulations. The stiffness constants in the bond and angle potentials correlate well with 
that predicted by the phonon dispersion relation for a 1-D chain of harmonic oscillators. 

The model is physico-chemistry based and uses no structural information (i.e. no Go potentialsE3) 
to provide bias towards the DNA crystal structure. The potentials result in a stable double-stranded 
helix with both major and minor grooves, and a persistence length for single- and double-stranded 
DNA comparable to experimental values. 

We have introduced quantitative recipes appropriate for the dynamical trajectories in molecular 
dynamics to quantify structural order parameters in the system such as degree of stacking and amount 
of twist. We investigated these structural properties along with other quantities such as persistence 
length, radius of gyration, and chirality, for both single- and double-stranded DNA where appropriate, 
as various environmental factors such as temperature and ionic concentration were varied. We also 
investigated structural order in ss- and ds-DNA as internal parameters such as number of bases, base 
sequence, and stacking strength were varied. 
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We find that at lower temperatures in the model, ss poly(A) stacks significantly more strongly than 
ss poly(T), consistent with the conclusions from the Libchaber groupLJ. However, at higher temper- 
atures another regime is reached where non-local interactions between bases govern the persistence 
length: poly(A) forms a collapsed globule with shorter persistence length than poly(T) which forms 
a more expanded globule. The scaling exponents for the radius of gyration likewise show inverse be- 
haviors in these two temperature regimes. The persistence length of ss-DNA initially decreases with 
increasing temperature in accord with the worm-like chain model, however at higher temperatures 
where non-local interactions become important, the persistence length shows an increasing trend over 
a large range of temperature, while the radius of gyration of the DNA globule expands through a 
collapse-transition temperature. In other words, below the crossover temperature, stacking interac- 
tions stiffen the chain and increase the persistence length, while above it nonlocal base-base van der 
Waals interactions (which are inseparable from stacking interactions) soften the chain and decrease £ P . 

We also investigated the interplay of forces that results in twist for ds-DNA. We find that under 
typical conditions, base-stacking interactions are the dominant factor in driving twist, in spite of the 
fact that a ladder configuration would minimize base-stacking energy. Increasing the putative base- 
stacking strength frustrates all other interactions in the system as twist increases, indicating no other 
interaction could be responsible for inducing twist. However, the RE 2 potential responsible for base- 
stacking is achiral, and energetically minimized when bases are stacked directly on top of each other. 
We thus infer that base stacking enhances the chiral properties of the constituent components in DNA 
by bringing them in close proximity, resulting in increased twist. Moreover, when base interactions 
are sufficiently strong (~ 5-10X their putative value), bases eventually stack more directly on top of 
each other at the expense of twist. This results in non-monotonic behavior of the twist as a function 
of stacking strength. Both twist and stacking increase as Coulomb interactions are more effectively 
screened. Even in the native thermodynamically stable structure, DNA is under stress and thus 
strained due to competition between the various potentials. 

In our model, the structure in hybridized poly(G)-poly(C) shows different temperature dependence 
than poly(A)-poly(T), with Ggo-Ceo showing much more order as temperature is raised, along with a 
sudden first-ordcr-likc drop in twist and stacking at a transition temperature below the dchybridization 
temperature. The untwisting with temperature is smoothed out for the heterogeneous sequence, in a 
similar manner to disorder-induced broadening of a phase transition. It is notable how sensitive the 
qualitative behavior is to the sequence: A-T bases differ only mildly from their G-C counterparts in 
size and energy scales, yet these differences are enough to determine the presence or absence of critical 
behavior with respect to the order parameters of helical twist and base pair stacking. 

Base stacking was analyzed at the level of resolution of individual bases, where stacking was found 
to exhibit an intermittent, decimated pattern wherein roughly four to five bases stack well in groups at 
the expense of poor stacking in nearby bases. This quasi-periodic structure is reminiscent of systems 
frustrated by incommensurate length scales such as the Frenkel-Kontorova model. Consistent with this 
notion, decreasing the putative strength of bond, angle, and dihedral interactions in the model resulted 
in less stacking heterogeneity, and an increase in the degree of stacking. The heterogeneous stacking 
pattern was observed over the total simulation time of about a [is. One would anticipate that over 
longer time scales the specific heterogeneous pattern would shift to other metastable configurations. 

Including anisotropic van der Waals interactions through the RE 2 potential introduces a large 
number of parameters in the model, so that the reduction in total number of parameters from that 
required in all-atom simulations is not dramatic. For each of the 10 base-base interactions, there are 
6 radii, 6 well-depths, an overall energy scale A12 and an atomic length scale er c , for a total of 140 
parameters. Base-backbone potentials introduce another 80 parameters. Including masses, bonds, 
angles, dihedrals, screened electrostatics, Langevin coefficients, phosphate charge, and temperature 
gives a total of 382 parameters. To simulate the same system using all atom potentials requires 
at least 600 parameters including van der Waals parameters, bonds, angles, dihedrals, and atomic 
properties such as mass, charge and diffusion coefficient (this number can increase to thousands if 
accurate van der Waals potentials are sought). 

On the other hand, the number of degrees of freedom in the coarse grained model is substantially 
reduced. Each base-sugar-phosphate residue has only 9 degrees of freedom in our model once rigid 
constraints are accounted for, whereas the same residue has approximately 100 degrees of freedom 
in the all-atom model, if hydrogen atoms are treated as rigidly bonded. Implicit solvent can be 
present in both coarse-grained and all-atom scenarios. The ten- fold reduction in the number of degrees 
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of freedom allows the coarse-grained model to explore longer time scale phenomena than would be 
practically obtainable with all- atom simulations. 

Base-base interactions had to be strengthened to reproduce the properties of the crystal structure. 
Since the putative strength of the base-base interactions resulted from direct all-atom parameteriza- 
tion, this implies that the functional form of the RE 2 potentials may not fully capture the electron 
correlations governing stacking interactions. These polarization effects likely induce a many-body co- 
operative component to the stacking interaction for coarse-grained bases, resulting in a much stiffcr 
potential surface for local fluctuations around the native structure. Similarly, many-body interactions 
between coarse-grained residues in proteins were necessary to effectively capture protein folding rates 
and mechanismsEj. 

We found that the physicochemical based model showed energetic bias towards towards a right 
handed form of ds-DNA helix over a left handed form. The more stable right handed form had longer 
persistence length. While the dihedral potentials (as determined from the equilibrium sampling of a 
short piece of ss-DNA) yield potentials that break chiral symmetry towards the right-handed helix, their 
role in determining chirality is sufficiently coupled to other interactions in the system that strengthening 
the dihedral potential alone does not enhance chirality. The chiral energetic bias was largest for the 
all-atom parameterized values of the dihedral strength, i.e. while decreasing the overall strength of the 
dihedral potentials diminished chiral preference in the model, increasing dihedral strength also did not 
enhance chirality, but instead diminished it. This effect is likely due to an interplay between stacking 
and dihedral interactions. It appears that like twist, chirality is induced not so much by the direct 
inherent energetic preferences of potentials, but by an indirect minimization of frustration induced by 
the forced compression of the system due to base-base stacking interactions. 

That is, preference for a right-handed helix over the left-handed helix for ds-DNA in our model 
arises predominantly from the stacking of chiral constituents: each phosphate-sugar-base-phosphatc 
constituent is a non-planar object that cannot be transformed by rotations and translations into its 
mirror image. Chirality is enforced by asymmetries in the equilibrium values of the bond and angle 
potentials which determine the minimum energy structure of these molecular constituents. Specifically, 
the molecular building block P3'-S-Basc-P5/ has sugar residue at the chiral center, and there is no center 
of inversion or mirror plane. The subsequent building block is constrained to have its P3' residue at 
the position of the previous block's P5' residue, and is coupled to the previous building block through 
base-stacking interactions. In this way, the handedness of DNA is induced by coupling these chiral 
objects together through stacking interactions, analogous to the mechanism behind the right-handed 
preference of a-helices due to hydrogen-bond-mediated coupling between chiral L-form amino acidsLj. 
Phospholipid-nucleoside conjugates have also been observed to exhibit spontaneous right-handed helix 
formation, with no helical preference present for the conjugate with nucleic acid bases removedLj. This 
again reinforces the idea that helicity can be induced by coupling chiral constituents together through 
an achiral force. Such spontaneous formation of handed helical structures from chiral ingredients is also 
reminiscent of the mechanism by which chiral nematogens, interacting through-simple Lennard-Jones- 
like potentials, form a cholesteric (twisted nematic) phase in a liquid crystalES. Exploration of the 
origins of DNA handedness including C-G sequence preference and nucleotide pairing as in Z-DNA, as 
well as more refined structural studies of the connection between atomistic and coarse grained models 
in the context of chirality, are topics for future work. 
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AI APPENDIX 

A The RE 2 potential between ellipsoids 

The RE 2 potential, developed by Babadi et a/.IEl, is a generalization of the ubiquitous Lcnnard-Jones 
interaction to ellipsoidal particles. The extra rotational degrees of freedom accounted for by the RE 2 
potential come with a greater computational burden: the RE 2 potential needs five times the number 
of input parameters per interaction pair (including the cutoff distance) as does the Lennard- Jones 
potential. 

The first six of these parameters come from the radii of the two ellipsoids, expressed as the following 
shape tensor: 



S i =diag(a«,a«, ( rW) 



(A. 



The anisotropic well depths are expressed as the relative potential well depth tensor. Its entries 
arc dimcnsionlcss and arc inversely proportional to the well depths in their respective directions. 



E^diagteMeMeM) 



(A.9) 



Another input parameter with a dimension of distance is the atomic interaction radius, cr c , which 
characterizes the distance scale for interactions between the atomic constituents to be coarse-grained. 
The energy scale is given by the Hamakcr constant, Aij . Finally, there is the necessary interaction cutoff 
distance, R cu ^, as the computation time to compute all 0(N 2 ) RE 2 interactions is too prohibitive, and 
is unnecessary because of how quickly the potential decays with separation. 

The RE 2 potential between two ellipsoids, labelled as i = 1,2, can be conveniently written in terms 
of attractive and repulsive components. In the expressions below, the tensor Aj is the rotation matrix 
from the lab frame to the rotated principal axis frame of particle i. 
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The quantities 7712, X12 an d h\2 in Eqs. (|A.1C| - |A.11 ) are defined by 
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G = Af SfAi + A^S^Aa 

In LAMMPS, an RE 2 -likc interaction between an ellipsoid and a sphere may be specified by making 
the radii parameters of the sphere zero. The radius of the sphere is given by er c . In this case, the 
RE 2 interaction between the objects is calculated in the limit that S2 — > and A12 — > 00 at a rate 
of A12 ~ 1/ det(S2). The relevant energy parameter for this interaction we will call A12 = A^/pcr^, 
where p is the number density of the sphere. That is, |7rdet(S2) / o = 1. The potential may then be 
straightforwardly evaluated by substituting A 2 = I and S 2 = into the RE 2 potential: 



Vk(Ai,I, ri 2 ) = (i + 3xia fL) n ( ^ , ) (A-12) 

il2 4cr3 60 / CT C \6/. .45 cr c \ -n- / 



Vr(Ai, I, ri 2 ) = ^ 1 + S^ II (D ; r (A.13) 

2025 3tt /i 12 \h 12 J V 56 /i 12 / JLA^ + ft^/eos / 

A more complete discussion of the RE 2 potential including its advantages over the alternative 
biaxial Gay-Berne potentialEll is discussed in the literaturel2ZI. 



B Energetic and Hydrodynamic comparison. 

For our Langcvin simulations, we directly use the friction eigenvalues extracted from all-atom simu- 
lations. However, it is instructive to compare the effective size of the bases both energetically and 
hydrodynamically. 

The shape-dependent friction coefficients of ellipsoidal bodies can be obtained from the low Reynolds 
number solution to the Navier-Stokes hydrodynamics equationsE3, and can be expressed in terms of 
the three radii in the principal axes: 

C (t) = (14a) 
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To verify that the continuum hydrodynamic expressions for the friction tensor faithfully reproduce 
the diffusive motion of a base, we extracted diffusion coefficients from all atom simulations using the 
CHARMM potential. The all atom simulations were performed with a single base solvated in a (20 
A) 3 box of water in a 0.2 mol/L neutral KC1 solution. The simulations were performed for 10 ns in 
the NPT ensemble. Four such simulations were performed, one for each type of base. Temperature 
was maintained at 31 OK by Langcvin dynamics and pressure was maintained at 1 atm. 



Hydrodynamic radii are found by best fit of Eqs. (14E-b) to the observed effective friction coefficients 
for both rotation and translation in all-atom simulations. Tabulated values of friction coefficients and 
effective hydrodynamic radii are shown in Table SVI. These hydrodynamic radii may then be compared 
with energetic radii (for our Langevin simulations, we directly use the friction eigenvalues extracted 
from all-atom simulations rather than the parameters resulting from the best fit to the hydrodynamics 
of an ellipsoid). 



A plot of hydrodynamic radii vs. energetic radii is shown in Fig. A17. The two measures compare 
well, however the hydrodynamic size tends to be smaller than the energetic size. Either the param- 
eterization scheme we employed resulted in RE 2 potentials that overestimated the range of repulsive 
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forces, or the smaller hydrodynamic values may be the result of the breakdown of the no-slip condition 
at the interface between bases and water, a reasonable scenario given that the size scale of the bases 
(~ A) tests the limits of the macroscopic assumptions in continuum hydrodynamics. 
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FIG. A17: Hydrodynamic vs. energetic radii for effective ellipsoids. Hydrodynamic radii are found by 
optimizing the measured friction coefficients extracted from all-atom simulations of isolated bases and 
fitting to Eqs. ( |14a| -b) for the rotational and translational friction of an ellipsoid. The values are strongly 
correlated, however hydrodynamically-derived radii tend to be smaller than energetically-derived radii. 
For the thinnest axis, the assumption of continuum hydrodynamics is expe cted to be least accurate, and 
relative modifications due to hydration effects are expected to be largest^' c^ 6 H I 6 2 IH. 
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FIG. A18: (A) Stiffness coefficient k r for bond potentials as a function of the mean distance between the 
participating residues, along with the phonon dispersion relation for a 1-D chain of harmonic oscillators^!, 
k e jf(X) = k Q sin 2 (7ra/A), with wavelength A here taken to be the distance r. The correlation coefficient 
is r = 0.995 and chance probability p ~ 4 x 10 -5 . (B) Stiffness coefficient kg for angle potentials as a 
function of the mean length of the two bonds participating in the angle. 
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FIG. A19: Persistence length £ P and radius of gyration R g vs. a multiplicative factor weighting the RE 2 
energies for a ssDNA sequence HETss,25 at ionic concentration of 0.04 mol/L, temperature 0.42e simu- 
lated for 240 ns. Note that the parameters, resulting from fitting the RE 2 potential to all-atom potentials, 
result in DNA whose persistence length and collapse is highly sensitive to external conditions. Such a 
sensitivity could be exploited in vivo by either varying the ionic environment or through histone phos- 
phorylation, acetylation, or methylation, for the purposes of gene regulation. Insets show representative 
snapshots taken from the simulations!!^. 



o 



CO 
Q_ 
<D 
CO 
CD 

CO 
JZ 
Q_ 
U) 
O 



CO 
CD 




5 10 15 20 25 30 35 40 
Twist (degrees) 



FIG. A20: Mean distance between phosphate groups as a function of helical twist. Data taken from the 
same simulations as in Fig. |ll| 
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